Selecting Near-Optimal Learners via Incremental Data Allocation

Authors

Ashish Sabharwal

Horst Samulowitz

Gerald Tesauro

Abstract

We study a novel machine learning (ML) problem setting of sequentially allocating small subsets of training data amongst a large set of classifiers. The goal is to select a classifier that will give near-optimal accuracy when trained on all data, while also minimizing the cost of misallocated samples. This is motivated by large modern datasets and ML toolkits with many combinations of learning algorithms and hyper-parameters. Inspired by the principle of “optimism under uncertainty,” we propose an innovative strategy, Data Allocation using Upper Bounds (DAUB), which robustly achieves these objectives across a variety of real-world datasets. We further develop substantial theoretical support for DAUB in an idealized setting where the expected accuracy of a classifier trained on n samples can be known exactly. Under these conditions we establish a rigorous sub-linear bound on the regret of the approach (in terms of misallocated data), as well as a rigorous bound on suboptimality of the selected classifier. Our accuracy estimates using real-world datasets only entail mild violations of the theoretical scenario, suggesting that the practical behavior of DAUB is likely to approach the idealized behavior.
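The allocation idea in the abstract can be illustrated with a small sketch. This is not the authors' exact DAUB procedure; it is a simplified illustration under stated assumptions. It assumes the idealized setting from the abstract (the accuracy of each learner at n samples is known exactly), models each learner as a hypothetical accuracy function, and uses a linear extrapolation of the most recent slope as the optimistic upper bound on full-data accuracy. The function name `daub_sketch`, the `start` and `growth` parameters, and the synthetic learning curves are all illustrative assumptions.

```python
import math


def daub_sketch(curves, total, start=100, growth=1.5):
    """Pick a learner by optimistic, incremental data allocation.

    curves: dict mapping a learner name to a function n -> accuracy
            (assumed exact, increasing, and concave in n).
    total:  number of training samples available.
    Returns (selected learner, samples allocated to each learner).
    """
    # Give every learner two small allocations so a slope can be estimated.
    hist = {}
    for name, f in curves.items():
        n0 = min(start, total)
        n1 = min(math.ceil(growth * n0), total)
        hist[name] = [(n0, f(n0)), (n1, f(n1))]

    def upper_bound(h):
        # Optimistic estimate of accuracy on all data: extend the last
        # observed slope linearly out to `total`, capped at 1.0.
        (n_prev, a_prev), (n, a) = h[-2], h[-1]
        if n == n_prev:
            return a
        slope = max(0.0, (a - a_prev) / (n - n_prev))
        return min(1.0, a + slope * (total - n))

    while True:
        # Optimism under uncertainty: follow the most promising bound.
        best = max(hist, key=lambda k: upper_bound(hist[k]))
        n = hist[best][-1][0]
        if n >= total:
            # The leader has already seen all data, so its bound is exact.
            return best, {k: h[-1][0] for k, h in hist.items()}
        n_next = min(math.ceil(growth * n), total)
        hist[best].append((n_next, curves[best](n_next)))


# Synthetic, hypothetical learning curves (concave and increasing in n).
curves = {
    "A": lambda n: 0.90 - 1.0 / math.sqrt(n),
    "B": lambda n: 0.80 - 0.5 / math.sqrt(n),
    "C": lambda n: 0.70 - 0.2 / math.sqrt(n),
}
total = 10000
best, alloc = daub_sketch(curves, total)
print(best, alloc)
```

For concave, increasing curves the linear extrapolation never underestimates full-data accuracy, so in this idealized sketch a weaker learner stops receiving data once its optimistic bound falls below the leader's. Only the selected learner is trained on all samples, and the data given to the others corresponds to the misallocation cost the abstract refers to.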

Similar resources

Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation

Decentralized resource allocation is a key problem for large-scale autonomic (or self-managing) computing systems. Motivated by a data center scenario, we explore efficient techniques for resolving resource conflicts via cooperative negotiation. Rather than computing in advance the functional dependence of each element’s utility upon the amount of resource it receives, which could be prohibitiv...

The Effects of Technical and Organizational Activities on Redundancy Allocation Problem with Choice of Selecting Redundancy Strategies Using the memetic algorithm

The redundancy allocation problem is one of the most important problems in the reliability area. It involves choosing suitable redundancy levels under certain strategies to maximize system reliability under some constraints. Many changes have been made to this problem to bring it closer to real situations. Selecting the redundancy strategy, using different system configuration are some of...

L2 Writing Feedback Preferences and Their Relationships with Entity vs. Incremental Mindsets of EFL Learners

The present study was aimed at investigating intermediate Iranian EFL learners’ feedback preferences on their L2 writing and examining the possible differences between learners with entity and incremental language mindsets with respect to their feedback preferences. To this end, 150 EFL learners were recruited from several language institutes in Isfahan, Iran, and their language proficiency lev...

Near-Optimal Bayesian Ambiguity Sets for Distributionally Robust Optimization

We propose a Bayesian framework for assessing the relative strengths of data-driven ambiguity sets in distributionally robust optimization (DRO) when the underlying distribution is defined by a finite-dimensional parameter. The key idea is to measure the relative size between a candidate ambiguity set and a specific, asymptotically optimal set. This asymptotically optimal set is provably the sm...

Near-Optimal Bayesian Ambiguity Sets for Distributionally Robust Optimization

We propose a Bayesian framework for assessing the relative strengths of data-driven ambiguity sets in distributionally robust optimization (DRO) when the underlying distribution is defined by a finite-dimensional parameter. The key idea is to measure the relative size between a candidate ambiguity set and a specific asymptotically optimal set. As the amount of data grows large, this asymptotica...

Publication date: 2016
